System for determination of temperature cycling protocols for polymerase chain reactions:plymerase extension

ABSTRACT

In one aspect, methods are described herein for enhancing one or more nucleic acid interactions. For example, in some embodiments, methods of enhancing one or more steps of polymerase chain reaction (PCR) are described herein. In some embodiments, the optimal temperature cycling protocol for one or more PCR cycles can be determined according to methods described herein.

RELATED APPLICATION DATA

The present application is a continuation application of U.S. patent application Ser. No. 14/892,924 filed Nov. 20, 2015, which is a United States National Phase of PCT/US2014/039251, filed May 22, 2014, which claims priority pursuant to 35 U.S.C. § 119(e) to U.S. Provisional Patent Application Ser. No. 61/826,269 filed May 22, 2013 and U.S. Provisional Patent Application Ser. No. 61/982,108 filed Apr. 21, 2014, each of which is incorporated by reference in its entirety.

BACKGROUND

The polymerase chain reaction (PCR), the most common DNA amplification reaction, is a cyclic amplification process that can produce millions of copies of double-stranded DNA molecules starting from a single molecule. The traditional PCR reaction cycle consists of three temperature steps that induce (i) double-stranded DNA denaturation, (ii) oligonucleotide primer annealing to the resulting single-stranded DNAs, and (iii) polymerase enzyme-mediated extension to produce two double-stranded DNA molecules. This cycle is repeated 20-30 times, resulting in geometric growth of the number of DNA molecules. Despite the apparent simplicity of the PCR reaction, standard PCR temperature cycling approaches are often fraught with difficulties that can decrease the average cycle efficiency or result in competitive amplification of undesired side products. Due to geometric amplification, this can result in dramatically diminished efficiency and specificity of the overall reaction.

SUMMARY

In one aspect, methods are described herein for enhancing one or more nucleic acid interactions. For example, in some embodiments, methods of enhancing one or more steps of polymerase chain reaction (PCR) are described herein. In some embodiments, the optimal temperature cycling protocol for one or more PCR cycles can be determined according to methods described herein. In some embodiments, a computer implemented method for enhancing annealing efficiency of first and second single strands of deoxyribonucleic acid (DNA) over a predetermined time period is described herein, the method comprising accessing a memory via a computing device processor, wherein the memory comprises computer instructions that when operated by the computing device processor cause the computing device processor to receive information comprising sequences of the first and second DNA strands, concentrations of the first and second DNA strands, length of the predetermined time period and a set of annealing reaction temperatures; determine the forward reaction rate constant, k_(f), of annealing the first and second single strands of DNA at annealing reaction temperatures in the set; determine the reverse reaction rate reaction rate constant, k_(r), of annealing the first and second single strands of DNA at annealing reaction temperatures in the set; and select one or more of the annealing reaction temperatures sufficient to produce a DNA concentration from the annealed first and second strands in excess of a predetermined threshold within the predetermined time period.

In another embodiment, a computer implemented method for enhancing annealing efficiency of first and second single strands of deoxyribonucleic acid (DNA) at a predetermined temperature is described herein, the method comprising accessing a memory via a computing device processor, where the memory comprises computer instructions that when operated by the computing device processor causes the computing device processor to receive information comprising sequences of the first and second DNA strands, concentrations of the first and second DNA strands, the predetermined annealing reaction temperature and a set of annealing reaction time periods; determine the forward reaction rate constant, k_(f), of annealing the first and second single strands of DNA; determine the reverse reaction rate reaction rate constant, k_(r) , of annealing the first and second single strands of DNA; and select one or more of the annealing reaction time periods sufficient to produce a DNA concentration from the annealed first and second strands in excess of a predetermined threshold at the predetermined annealing reaction temperature.

In another aspect, a computer implemented method for enhancing amplification efficiency of a sequence of deoxyribonucleic acid (DNA) in a first cycle of polymerase chain reaction (PCR) is described herein, the method comprising accessing a memory via a computing device processor, where the memory comprises computer instructions that when operated by the computing device processor causes the computing device processor to (a) receive information comprising sequences of single strand DNA and primer, concentrations of the single strand DNA and primer, length of a predetermined time period for annealing the single strand DNA and primer and a set of annealing reaction temperatures; (b) determine the forward reaction rate constant, k_(f), of annealing the single strand DNA and primer at annealing reaction temperatures in the set; (c) determine the reverse reaction rate reaction rate constant, k^(r), of annealing the single strand DNA and primer at reaction temperatures in the set; and (d) select one or more of the annealing reaction temperatures sufficient to produce a primer hybridized DNA strand concentration in excess of a predetermined threshold within the predetermined time period for annealing.

In some embodiments, the computing device further (a′) determines a time period for duplex DNA melting sufficient to produce a concentration of single strand DNA in excess of a predetermined threshold. Determining the time period for duplex DNA melting can comprise determining forward and reverse rate constants of duplex DNA melting by at least one of:

(1) identification of different melting domains for a given DNA sequence using an algorithmic method;

(2) ascending ordering of domains based on melting temperature, wherein the equilibrium constant of each domain as a function of temperature is calculated via the equation:

$K_{{eq},{dom}} = {\sigma_{c}{f(N)}{\prod\limits_{i = 1}^{N}\; S_{i}}}$

where N is the domain length, s_(i) is the equilibrium constant for formation of base pair i and σ_(c) is the cooperativity parameter that constitutes a penalty on the statistical weight

$\prod\limits_{i = 1}^{N}\; S_{i}$

for melting of the domain due to the free energy cost of dissociation of an internal base pair, and f(N) is a loop closure function specific to the type of domain that introduces a length dependence on the free energy cost;

(3) construction of state space matrices based on a two-sided or one sided matrix for each domain;

(4) association of σ, the nucleation parameter with individual states in a last block as in the matrix below;

(5) diagonalization of each block and rank ordering of eigenvalues of all blocks;

(6) determining relaxation time, τ, according to 1/max(λ_(i)), wherein max(λ_(i)) denotes the largest eigenvalue of an annealing reaction matrix of the form (here depicted for a two base sequence)

$A = \begin{bmatrix} {- \left( {k^{- 1} + k_{- 1}} \right)} & k_{1} & k_{1} & 0 \\ k_{- 1} & {- \left( {k_{- 2} + k_{1}} \right)} & 0 & {{\sigma \left\lbrack D_{0,{2{eq}}} \right\rbrack}k_{1}} \\ k^{- 1} & 0 & {- \left( {k^{- 2} + k_{1}} \right)} & {{\sigma \left\lbrack D_{0,{2{eq}}} \right\rbrack}k_{1}} \\ 0 & k_{- 2} & k^{- 2} & {{- 2}{\sigma \left\lbrack D_{0,{2{eq}}} \right\rbrack}k_{1}} \end{bmatrix}$

wherein k₁=k₂ which equal the forward rate constant for formation of a single base pair and k_(−i) is the reverse rate constant for the formation of the single base pair wherein s_(i)=k₁/k_(−i) where s_(i) is the equilibrium constant for formation of base pair i, and σ is a nucleation parameter that measures resistance to formation of the first base pair.

(7) comparing all relaxation times and identifying the maximum value of the relaxation time as the relaxation time of the whole DNA sequence

(8) computing forward and reverse rate constants k_(m),k_(−m) for DNA melting based on the relaxation time and the Gibbs free energy of melting.

Additionally, in some embodiments, the computing device processor further (e) determines a time period for DNA extension sufficient to produce a concentration of DNA in excess of a predetermined threshold. Determining the time period for DNA extension can comprise determining the forward reaction rate constant, k_(e), of DNA polymerase binding and determining the reverse reaction rate constant, k_(−e), of DNA polymerase binding. k_(e) and k_(−e) can be determined according to the processivity of the polymerase and the Gibbs free energy for the DNA polymerase binding reaction through the relationships

$K_{binding} = \frac{k_{e}}{k_{- e}}$

and:

$k_{- e} = {{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left( \frac{1}{{E\left\lbrack i_{off} \right\rbrack} - 1} \right)}$

where i indexes the sequence position on the template,

$\frac{k_{cat}}{K_{N}}$

is the apparent second order rate constant of nucleotide addition, and

${E\left\lbrack i_{off} \right\rbrack} = \frac{1}{1 - p}$

is the processivity or expected position of dissociation of the polymerase, with p denoting the microscopic processivity parameter, the microscopic processivity parameter being the conditional probability of the polymerase not dissociating at position/time i, given that it was bound to the template at position i−1. Additionally, determining the time period for DNA extension can further comprise determining the enzyme bound hybridized DNA strand concentration, [E.SP] and the concentrations of all enzyme bound partially extended DNA molecules according to:

$\begin{matrix} {{\frac{d}{dt}\left\lbrack {E.{SP}} \right\rbrack} = {{{k_{e}\lbrack E\rbrack}\lbrack{SP}\rbrack} - {k_{- e}\left\lbrack {E.{SP}} \right\rbrack} - {{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left\lbrack {E.{SP}} \right\rbrack}}} \\ {{{\frac{d}{dt}\left\lbrack {E.D_{i}} \right\rbrack} = {{{k_{e}\lbrack E\rbrack}\left\lbrack D_{i} \right\rbrack} - {k_{- e}\left\lbrack {E.D_{i}} \right\rbrack} + {{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left\{ {\left\lbrack {E.D_{i - 1}} \right\rbrack - \left\lbrack {E.D_{i}} \right\rbrack} \right\}}}},} \\ {{i = 1},\ldots \;,{n - 1}} \end{matrix}$

where [N] is the nucleotide concentration, [E.D_(i)] is the concentration of enzyme bound primer-template DNA with i added nucleotides, and n denotes the number of bases in the DNA sequence. K_(N) and k_(cat) can be theoretically determined or determined from a plot of 1/v versus 1/[N] wherein the slope of the plot is approximately

$\frac{K_{N}}{k_{cat}E_{0}}$

and the y-intercept is

$\frac{1}{k_{cat}E_{0}},$

wherein E₀ is the initial concentration of DNA polymerase and v is an experimentally or theoretically determined rate of reaction for nucleotide addition.

Amplification of the DNA concentration in excess of a predetermined threshold can be achieved through an algorithm that solves an optimization problem of the form

${\min\limits_{T{(t)}}J},{J = {\sum\limits_{i = 1}^{{4n} + 4}{x_{i}\left( t_{f} \right)}}}$

subject to the constraint

${\frac{dx}{dt} = {f\left( {{x(t)},{T(t)},k} \right)}},{{x(0)} = x_{0}}$

where T(t) denotes the temperature of the reaction mixture as a function of reaction time, x=[[S₁],[S₁P₁],[E.S₁P₁],[E.D₁ ¹], . . . , [DNA],[P₁],[P₂],[N],[E]]^(Tr) is the vector of concentrations of all reaction species, x₀ denotes the vector of initial concentrations, t_(f) denotes the predetermined time for a single PCR cycle, and f(x(t), T(t), k) denotes the right-hand side of the system of rate equations for primer annealing, enzyme binding and polymerase-mediated extension and k is a vector of all kinetic rate constants appearing in the system of PCR rate equations. f(x(t),T(t), k) can be expressed in terms of the sequence-dependent rate constants k_(i) appearing in

$\left\lbrack {k_{m},k_{- m},k_{f}^{1},k_{r}^{1},k_{r}^{2},k_{1}^{e},k_{- 1}^{e},\frac{k_{cat}}{K_{N}},{k_{cat}}^{\prime}} \right\rbrack \mspace{14mu} {as}\mspace{14mu} {\sum\limits_{i}{{k_{i}\left( {T(t)} \right)}{g_{i}\left( {x(t)} \right)}}}$

g_(i)(x(t)) denotes the part of the right-hand side of the system of rate equations for DNA melting, primer annealing, enzyme binding and polymerase-mediated extension associated with rate constant k_(i). Additionally, the optimization problem, in such embodiments, can be formulated as a system of differential-algebraic equations (DAE) of:

$\begin{matrix} {{H\left\lbrack {{x(t)},{\varphi (t)},{T(t)}} \right\rbrack} = {{\varphi^{Tr}(t)}{\sum\limits_{i = 1}^{10}{{k_{i}\left( {T(t)} \right)}{g_{i}\left( {x(t)} \right)}}}}} \\ {\frac{d\; \varphi}{dt} = {{- {\nabla_{x}{H\left\lbrack {{x(t)},{\varphi (t)},{T(t)}} \right\rbrack}}} = {- {\sum\limits_{i = 1}^{10}{{k_{i}\left( {T(t)} \right)}\left( \frac{\partial{g_{i}\left( {x(t)} \right)}}{\partial{x(t)}} \right)^{Tr}{\varphi (t)}}}}}} \\ {{\varphi \left( t_{f} \right)} = {\left\lbrack {\nabla_{x}{F\left( {x(t)} \right)}} \right\rbrack_{t = t_{f}} = \left\lbrack {\nabla_{x}{\sum\limits_{i = 1}^{{4n} + 4}{x_{i}(t)}}} \right\rbrack_{t = t_{f}}}} \\ {{{\frac{\partial H}{\partial{T(t)}}\left\lbrack {{x^{*}(t)},{\varphi^{*}(t)},{T^{*}(t)}} \right\rbrack} = {{{{\varphi^{*{,{Tr}}}(t)}{\sum\limits_{i = 1}^{10}\frac{\partial{k_{i}\left( {T(t)} \right)}}{\partial{T(t)}}}}_{T^{*}{(t)}}{g_{i}\left( {x^{*}(t)} \right)}} = 0}},} \\ {\forall{t \in \left\lbrack {0,t_{f}} \right\rbrack}} \end{matrix}$

where H denotes the optimal control Hamiltonian function and ϕ(t)denotes the optimal control costate vector.

Alternatively, the optimal control problem, in some embodiments, is of the form:

${\min\limits_{T{(t)}}J},{J = t_{f}}$

subject to the constraints

$\begin{matrix} \left. i \right) & {{\frac{dx}{dt} = {f\left( {{x(t)},{T(t)},k} \right)}},{{x(0)} = x_{0}}} \\ \left. {ii} \right) & {{\lbrack{DNA}\rbrack \left( t_{f} \right)} = \lbrack{DNA}\rbrack_{target}} \end{matrix}$

where T(t) denotes the temperature of the reaction mixture as a function of reaction time, is the vector of concentrations of all reaction species, t_(f) denotes the predetermined time for a single PCR cycle, and f(x(t T(t), k) denotes the right-hand side of the system of rate equations for primer annealing, enzyme binding and polymerase-mediated extension and k is a vector of all kinetic rate constants appearing in the system of rate equations. As described above, f(x(t),T(t), k) can be expressed in terms of the sequence-dependent rate constants k_(i) appearing

${{in}\mspace{14mu}\left\lbrack {k_{m},k_{- m},k_{f}^{1},k_{r}^{1},k_{f}^{2},k_{r}^{2},k_{e},k_{- e},\frac{k_{cat}}{K_{N}},{k_{cat}}^{\prime}} \right\rbrack}\; {\quad\mspace{11mu} {as}\quad}{\quad\mspace{11mu} {\sum\limits_{i}{{k_{i}\left( {T(t)} \right)}{{g_{i}\left( {x(t)} \right)}.}}}}$

In such embodiments, the optimization problem can be formulated as a system of differential-algebraic equations (DAE) of:

$\begin{matrix} {{H\left\lbrack {{x(t)},{T(t)},{\varphi (t)}} \right\rbrack} = {{{L\left\lbrack {{x(t)},{T(t)}} \right\rbrack} + {{\varphi^{Tr}(t)}{\sum\limits_{i = 1}^{10}{{k_{i}\left( {T(t)} \right)}{g_{i}\left( {x(t)} \right)}}}}} =}} \\ {1 + {{\varphi^{Tr}(t)}{\sum\limits_{i = 1}^{10}{{k_{i}\left( {T(t)} \right)}{g_{i}\left( {x(t)} \right)}}}}} \\ {\frac{d\; \varphi}{dt} = {{- {\nabla_{x}H}} = {- {\sum\limits_{i = 1}^{10}{{k_{i}\left( {T(t)} \right)}\left( \frac{\partial{g_{i}\left( {x(t)} \right)}}{\partial{x(t)}} \right)^{Tr}{\varphi (t)}}}}}} \\ {{{\frac{\partial H}{\partial{T(t)}}\left\lbrack {{x^{*}(t)},{\varphi^{*}(t)},{T^{*}(t)}} \right\rbrack} = {{{{\varphi^{*{,{Tr}}}(t)}{\sum\limits_{i = 1}^{10}\frac{\partial{k_{i}\left( {T(t)} \right)}}{\partial{T(t)}}}}_{T^{*}{(t)}}{g_{i}\left( {x^{*}(t)} \right)}} = 0}},} \\ {\forall{t \in \left\lbrack {0,t_{f}} \right\rbrack}} \\ {{{\varphi_{i}\left( t_{f} \right)} = 0},{i = 1},\ldots \;,{{4n} + 4}} \\ {{{x_{{4n} + 5}\left( t_{f} \right)} = x_{{{4n} + 5},f}},{x_{{{4n} + 5},f} > {2^{m}{{x_{{4n} + 5}(0)}.}}}} \\ {{H_{tf}} = {{{0{\varphi^{Tr}\left( t_{f} \right)}}{\sum\limits_{i = 1}^{10}{{k_{i}\left( {T\left( t_{f} \right)} \right)}{g_{i}\left( {x\left( t_{f} \right)} \right)}}}} = {- 1}}} \end{matrix}$

where H denotes the optimal control Hamiltonian function, ϕ(t) denotes the optimal control costate vector, and * denotes the solution to the optimal control problem.

Moreover, the instructions can cause the computing device processor to repeat steps (a)-(e) for enhancing amplification efficiency of the sequence of DNA in at least one additional cycle of PCR.

In another aspect, a computer implemented method is described herein for determining the temperature cycling protocol for a PCR reaction that achieves amplification of DNA concentration in excess of a predetermined threshold, the method comprising:

accessing a memory via a computing device processor, where the memory comprises computer instructions that when operated by the computing device processor causes the computing device processor to:

-   -   a) receive information comprising sequences of single strand         DNAs and primers, concentrations of the single strand DNAs and         primers, the type of polymerase enzyme, the polymerase enzyme         concentration, and the nucleotide concentration;     -   b) determine the forward reaction rate constant, k_(f)(T), of         annealing each single strand DNA and primer as a function of         temperature T;     -   c) determine the reverse reaction rate constant, kr(T), of         annealing each single strand DNA and primer as a function of         temperature T;     -   d) determine the rate constant, k_(e)(T), of polymerase enzyme         binding to the hybridized primer-template DNA as a function of         temperature T     -   e) determine the rate constant, k_(−e)(T), polymerase enzyme         dissociation from the hybridized primer-template DNA as a         function of temperature T     -   f) determine the Michaelis constant, K_(N)(T), for nucleotide         binding to the enzyme-primer-template DNA complex E.SP as a         function of temperature T     -   g) determine the catalytic rate constant, k_(cat)(T), of         nucleotide addition to the enzyme-primer-template DNA complex         E.SP as a function of temperature T     -   h) determine the forward rate constant, k_(m)(T), for dsDNA         melting for the complete sequence as a function of temperature T     -   i) determine the reverse rate constant, k_(−m)(T), for dsDNA         melting for the complete sequence as a function of temperature T

solve an optimization problem of the form

${\min\limits_{T{(t)}}\; J},{J = {\sum\limits_{i = 1}^{{4n} + 4}{x_{i}\left( t_{f} \right)}}}$

subject to the constraint

${\frac{dx}{dt} = {f\left( {{x(t)},{T(t)},k} \right)}},{{x(0)} = x_{0}}$

where T(t) denotes the temperature of the reaction mixture as a function of reaction time, x=[[S₁],[S₁P₁],[E.S₁P₁],[D₁ ¹],[E.D₁ ¹], . . . , [DNA],[P₁],[P₂],[N],[E]]^(Tr) is the vector of concentrations of all reaction species, x₀ denotes the vector of initial concentrations, t_(f) denotes the predetermined time for a single PCR cycle, and f(x(t), T(t), k) denotes the right-hand side of the system of rate equations for primer annealing, enzyme binding and polymerase-mediated extension and k is a vector of all kinetic rate constants appearing in the system of PCR rate equations. f(x(t), T(t), k) can be expressed in terms of the sequence-dependent rate constants k_(i) appearing in

$\left\lbrack {k_{m},k_{- m},k_{f}^{1},k_{r}^{1},k_{f}^{2},k_{r}^{2},k_{e},k_{- e},\frac{k_{cat}}{K_{N}},{k_{cat}}^{\prime}} \right\rbrack \mspace{14mu} {as}\mspace{14mu} {\sum\limits_{i}{{k_{i}\left( {T(t)} \right)}{g_{i}\left( {x(t)} \right)}}}$

g_(i)(x(t)) denotes the part of the right-hand side of the system of rate equations for DNA melting, primer annealing, enzyme binding and polymerase-mediated extension associated with rate constant k_(i). Alternatively, the solved optimization problem can be in the form:

${\min\limits_{T{(t)}}\; J},{J = t_{f}}$

subject to the constraints

${{\left. {{{{\left. {ii} \right)\mspace{14mu} \frac{dx}{dt}} = {f\left( {{x(t)},{T(t)},k} \right)}},{{x(0)} = x_{0}}}{ii}} \right)\mspace{14mu}\lbrack{DNA}\rbrack}\left( t_{f} \right)} = \lbrack{DNA}\rbrack_{target}$

where T(t) denotes the temperature of the reaction mixture as a function of reaction time, is the vector of concentrations of all reaction species, t_(f) denotes the predetermined time for a single PCR cycle, and f(x(t),T(t), k) denotes the right-hand side of the system of rate equations for primer annealing, enzyme binding and polymerase-mediated extension and k is a vector of all kinetic rate constants appearing in the system of rate equations. As described above, f(x(t),T(t), k) can be expressed in terms of the sequence-dependent rate constants k_(i) appearing

${in}\mspace{14mu}\left\lbrack {k_{m},k_{- m},k_{f}^{1},k_{r}^{1},k_{f}^{2},k_{r}^{2},k_{1}^{e},k_{- 1}^{e},\frac{k_{cat}}{K_{N}},k_{cat}^{\prime}} \right\rbrack$ ${as}\mspace{14mu} {\sum\limits_{i}{{k_{i}\left( {T(t)} \right)}{{g_{i}\left( {x(t)} \right)}.}}}$

In another aspect, a computer-readable medium storing computer-executable instructions is described herien that, when executed, cause a processor of a computing device to perform a method comprising receiving information comprising sequences of first and second single strands of DNA, concentrations of the first and second single strands of DNA, a predetermined reaction time period for annealing the first and second single strands of DNA and a set of annealing reaction temperatures; determining the forward reaction rate constant, k_(f), of annealing the first and second single strands of DNA at annealing reaction temperatures in the set; determining the reverse reaction rate reaction rate constant, k_(r), of annealing the first and second single strands of DNA at reaction temperatures in the set; and selecting one or more of the annealing reaction temperatures sufficient to produce a DNA concentration from the annealed first and second strands in excess of a predetermined threshold within the predetermined time period.

In another aspect, a system for enhancing annealing efficiency of first and second single strands of deoxyribonucleic acid (DNA) over a predetermined time period is described herein, the system comprising a memory device; and a processor coupled to the memory device, the processor configured to receive information comprising sequences of first and second single strands of DNA, concentrations of the first and second single strands of DNA, a predetermined time period for annealing the first and second single strands of DNA and a set of annealing reaction temperatures; determine the forward reaction rate constant, k_(f), of annealing the first and second single strands of DNA at annealing reaction temperatures in the set; determine the reverse reaction rate reaction rate constant, k_(r), of annealing the first and second single strands of DNA at annealing reaction temperatures in the set; and select one or more of the annealing reaction temperatures sufficient to produce a DNA concentration from the annealed first and second strands in excess of a predetermined threshold within the predetermined time period.

In a further aspect, a system for enhancing amplification efficiency of a sequence of DNA in a first cycle of PCR is described herein, the system comprising a memory device and a processor coupled to the memory device, the processor configured to (a) receive information comprising sequences of single strand DNA and primer, concentrations of the single strand DNA and primer, length of a predetermined time period for annealing the single strand DNA and primer and a set of annealing reaction temperatures; (b) determine the forward reaction rate constant, k_(f), of annealing the single strand DNA and primer at annealing reaction temperatures in the set; (c) determine the reverse reaction rate reaction rate constant, k_(r), of annealing the single strand DNA and primer at reaction temperatures in the set; and (d) select one or more of the annealing reaction temperatures sufficient to produce a primer hybridized DNA strand concentration in excess of a predetermined threshold within the predetermined time period for annealing. In some embodiments, the processor is further configured to (e) determine a time period for DNA extension sufficient to produce a concentration of DNA in excess of a predetermined threshold as described herein.

These and other aspects are further described in the detailed description which follows.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 illustrates the effect of temperature on forward rate constant, k_(l), of each step of oligonucleotide hybridization reaction.

FIG. 2 illustrates (a) one-sided and (b) two-sided melting in conjunction with reaction mechanism R_(l) described herein.

FIG. 3 illustrates comparison between theoretically and experimentally estimates forward (k_(f)) and reverse (k_(r)) for single stranded (S₁, S₂) hybridization reaction.

FIG. 4 illustrates determination of DNA concentration in excess of a predetermined threshold for a set of annealing reaction temperatures and time periods.

FIG. 5 is a flow chart detailing steps for determination of forward and reverse rate constants for DNA strand hybridization according to one embodiment described herein.

FIG. 6 illustrates variation of the activation energy of the single base pair forward hybridization reaction with respect the sequence length N.

FIG. 7 represents a general reaction scheme for enzyme binding and extension reactions according to some embodiments described herein.

FIG. 8 identifies the steps involved in enzyme binding and extension model parameter estimation according to some embodiments described herein.

FIG. 9 illustrates the Michaelis-Menten plots of the activity of the Taq polymerase at various temperatures.

FIG. 10 provided temperature dependent extension rate constant

$\frac{k_{cat}}{K_{N}}$

according to some embodiments described herein.

FIG. 11 illustrates k_(−e) of DNA extension at three different temperatures according to some embodiments described herein.

FIG. 12 provides estimates of k_(e) based on temperature dependence of k_(−e) as illustrated in the Arrhenius plot.

FIG. 13 provides temperatures and reaction times for melting, annealing and extension steps of a first cycle of PCR according to one embodiment described herein.

FIG. 14 illustrates solutions to the system of rate equations for the entire PCR cycle for the three different annealing temperatures.

FIG. 15 compares the geometric growth of DNA with respect to reaction time under the above three reaction conditions.

FIG. 16 illustrates the modified temperature protocol based on this analysis.

FIG. 17 illustrates optimal cycle efficiency at the three different reaction conditions.

FIG. 18 provides g(x) for a simplex PCR with n=2.

FIG. 19 illustrate control vector iteration to solve the optimal control problem.

FIG. 20 illustrates the optimal DNA concentration profile obtained by solving PCR optimal control problem using control vector iteration according to one embodiment described herein.

FIG. 21 illustrates the optimal temperature profile obtained by solving PCR optimal control problem using control vector iteration according to one embodiment described herein.

FIG. 22(a)-(b) illustrate the convergence of an algorithm for solving PCR optimal control problem according to one embodiment described herein.

FIG. 23 illustrates the predicted evolution of reaction species during a typical primer annealing reaction

DETAILED DESCRIPTION

Embodiments described herein can be understood more readily by reference to the following detailed description and examples and their previous and following descriptions. Elements, apparatus and methods described herein, however, are not limited to the specific embodiments presented in the detailed description and examples. It should be recognized that these embodiments are merely illustrative of the principles of the present invention. Numerous modifications and adaptations will be readily apparent to those of skill in the art without departing from the spirit and scope of the invention.

I. Annealing Efficiency

Methods described herein, in some embodiments, can enhance annealing efficiency for first and second strands of DNA over a predetermined time period or at a predetermined temperature. In some embodiments, a computer implemented method for enhancing annealing efficiency of first and second single strands of deoxyribonucleic acid (DNA) over a predetermined time period is described herein, the method comprising accessing a memory via a computing device processor, wherein the memory comprises computer instructions that when operated by the computing device processor cause the computing device processor to receive information comprising sequences of the first and second DNA strands, concentrations of the first and second DNA strands, length of the predetermined time period and a set of annealing reaction temperatures; determine the forward reaction rate constant, k_(f), of annealing the first and second single strands of DNA at annealing reaction temperatures in the set; determine the reverse reaction rate reaction rate constant, k_(r), of annealing the first and second single strands of DNA at annealing reaction temperatures in the set; and select one or more of the annealing reaction temperatures sufficient to produce a DNA concentration from the annealed first and second strands in excess of a predetermined threshold within the predetermined time period.

A non-limiting example is now presented to further illustrate the foregoing method enhancing annealing efficiency. The computer readable instructions cause the computing device to receive information comprising sequences of the first and second DNA strands, concentrations of the first and second DNA strands, length of the predetermined time period and a set of annealing reaction temperatures. This data in the present non-limiting example is provided in Table I.

TABLE I Information received by the processor Information Category Assigned Value Sequence 1^(st) DNA strand A₁₀ Sequence 2^(nd) DNA strand U₁₀ Concentration 1^(st) DNA strand 40 μM Concentration 2^(nd) DNA strand 40 μM Length of predetermined time period 1 s Set of annealing reaction temperatures 3.4° C., 7.7° C., 12.6° C., 16.9° C. and 23.3° C. From this data set, the forward reaction rate constant, k_(f), and the reverse reaction rate constant, k_(r), are determined in terms of the equilibrium constant and relaxation time for DNA hybridization as follows. First, k_(f)/k_(r) is expressed as K_(eq)(T), which is determined according to the equation (1):

$\begin{matrix} {{K_{eq}(T)} = {{\exp \left( \frac{{- \Delta}\; {G(T)}}{RT} \right)} = \frac{k_{f}}{k_{r}\left\lbrack C_{0} \right\rbrack}}} & (1) \end{matrix}$

wherein ΔG of the entire sequence (here A₁₀:U₁₀) is obtained from the nearest neighbor (NN) method for DNA melting thermodynamics. The nearest neighbor method provides the ΔG for pairs of nucleotide bases in the input sequence (here AA:UU). The ΔG for the entire sequence employed in the expression of K_(eq)(T) is obtained by summation of these free energies and correction factors for end effects.

σ(T) is a resistance to formation of the first few base pairs, which constitute the hybridization nucleus. Following determination of K_(eq)(T), σ(T) is obtained according to the equation (2):

$\begin{matrix} {{K_{eq}(T)} = {{\sigma (T)}\left( {\prod\limits_{i = 1}^{N}{s_{i}(T)}} \right)}} & (2) \end{matrix}$

Wherein N denotes the sequence length (here 10), each s_(i) is determined according to the nucleotide base pair as set forth in equations (3) and (4).

$\begin{matrix} {{s_{AT}(T)} = {1.04 \times 10^{- 5}{\exp \left( \frac{8000}{RT} \right)}}} & (3) \\ {{s_{CG}(T)} = {1.04 \times 10^{- 5}{\exp \left( \frac{8000}{RT} \right)}}} & (4) \end{matrix}$

As provided in equation (2), the s_(i) values are multiplied to obtain the product Π_(i=1) ^(N)S_(i)(T) and σ(T) is solved for using the product Π_(i=1) ^(N)S_(i) (T) and the value of K_(eq)(T) determined above according to equation (1).

With σ(T) determined, k_(l)(T), the forward rate constant for hybridization of a single base pair, is next obtained in order to set up a state space matrix for determining relaxation time, τ, for strand hybridization. Depending on DNA strand/sequence length, k_(l) can be obtained by various methods. For sequence lengths of up to 14 bases, k_(l)(T) can be obtained from a plot such as that illustrated in FIG. 1. FIG. 1 illustrates the effect of temperature on the rate constant, k_(l), for several different sequence lengths. More generally, k_(l)(T) for any given nucleotide sequence length can be estimated using experimental data, as will be described further below in this example.

The relaxation time for a given oligonucleotide sequence, which is the characteristic time constant for the system to return to equilibrium after a perturbation from equilibrium, can be calculated based on either one-sided or two-sided oligonucleotide melting models. FIG. 2 illustrates one- and two-sided melting. A detailed unsteady state balance of the chemical reaction network for DNA hybridization yields a set of coupled first order differential equations after linearization around the equilibrium:

${\frac{dx}{dt} = {Ax}},{{x(0)} = x_{init}}$

where x denotes the vector of deviations of individual base pair concentrations from equilibrium.

In one-sided melting, the matrix A takes the form

$A = \begin{bmatrix} {{- {\sigma \left\lbrack D_{0{eq}} \right\rbrack}}k_{1}} & k_{- 0} & 0 & 0 & \ldots \\ {{\sigma \left\lbrack D_{0{eq}} \right\rbrack}k_{1}} & {{- k_{1}} - k_{- 0}} & k_{- 1} & 0 & \ldots \\ 0 & k_{1} & {{- k_{1}} - k_{- 1}} & k_{2} & \ldots \\ 0 & 0 & k_{1} & {- k_{- 2}} & \ldots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}$

Two-sided melting models, which allow melting to occur simultaneously from both sides of the sequence, are the most accurate. The reaction mechanism R_(l) of the two-sided melting reaction is given below. For clarity, the mechanism for N=2 is presented.

D_(m,n) represents the state of a DNA molecule where m and n denote the number of base pairs melted from left and right side of the DNA molecule respectively. In reaction R_(l), the horizontal direction represents right-sided melting and the vertical direction represents left-sided melting.

The total number of possible states in an N base pair hybridization reaction is (N+1)×(N+2)/2. Out of these, there are N states that have only one base pair hybridized, in N different positions. The σ parameter is associated with the rate constant for formation of each of these N states. The number of states in one-sided melting is of O(N), whereas in the two-sided melting model, it is of O(N²). Using the convention that states with the same m value are grouped together, the state vector for two-sided melting is:

x=[D _(0,0),D_(0,1),D_(0,2), D_(1,0),D_(1,1), D_(2,0)]^(T)

and the state space matrix for the reaction R₁ is:

$\begin{matrix} {A = \begin{bmatrix} {- \left( {k^{- 1} + k_{- 1}} \right)} & k_{1} & 0 & k_{1} & 0 & 0 \\ k_{- 1} & {- \left( {k_{- 2} + k^{- 1} + k_{1}} \right)} & {{\sigma \left\lbrack D_{0,{2{eq}}} \right\rbrack}k_{1}} & 0 & {{\sigma \left\lbrack D_{1,{1{eq}}} \right\rbrack}k_{1}} & 0 \\ 0 & k_{- 2} & {{- {\sigma \left\lbrack D_{0,{2{eq}}} \right\rbrack}}k_{1}} & 0 & 0 & 0 \\ k^{- 1} & 0 & 0 & {- \left( {k_{- 1} + k^{- 2} + k_{1}} \right)} & {{\sigma \left\lbrack D_{1,{1{eq}}} \right\rbrack}k_{1}} & {{\sigma \left\lbrack D_{2,{0{eq}}} \right\rbrack}k_{1}} \\ 0 & k^{- 1} & 0 & k_{- 1} & {{- 2}{\sigma \left\lbrack D_{1,{1{eq}}} \right\rbrack}k_{1}} & 0 \\ 0 & 0 & 0 & k^{- 2} & 0 & {{- {\sigma \left\lbrack D_{2,{0{eq}}} \right\rbrack}}k_{1}} \end{bmatrix}} & (5) \end{matrix}$

Each diagonal block of this two-sided melting matrix corresponds to melting from the right side with a constant number m of base pairs dissociated from the left side. In the two-sided melting mechanism, there are N+1 states representing fully dissociated single-stranded DNA (m+n=N). These can be represented using one notation, D_(0,N). In doing this, rates that we counted twice in the above state space matrix (5) are corrected. For example, reaction R₁, D_(1,0) gives D_(1,0) and D_(2,0,) which represent the single-stranded state. Hence, the disappearance rate of the molecule D_(1,0) has been counted twice. Thus, from the above state matrix (A), two redundant states (in general N states) should be removed and the rates that were counted twice should also be corrected to obtain the correct state space matrix for two-sided melting. Note the total number of states (N+1)×(N+2)/2−N. Therefore, the Corrected Reaction Mechanism (CRM) R₂ for two-sided melting (considering N−2) is given as:

The state space matrix for two-sided melting (N=2) based on the above modified reaction mechanism is thus given by:

$\begin{matrix} {A = \begin{bmatrix} {- \left( {k^{- 1} + k_{- 1}} \right)} & k_{1} & k_{1} & 0 \\ k_{- 1} & {- \left( {k_{- 2} + k_{1}} \right)} & 0 & {{\sigma \left\lbrack D_{0,{2{eq}}} \right\rbrack}k_{1}} \\ k^{- 1} & 0 & {- \left( {k^{- 2} + k_{1}} \right)} & {{\sigma \left\lbrack D_{0,{2{eq}}} \right\rbrack}k_{1}} \\ 0 & k_{- 2} & k^{- 2} & {{- 2}{\sigma \left\lbrack D_{0,{2{eq}}} \right\rbrack}k_{1}} \end{bmatrix}} & (6) \\ {x = \left\lbrack {D_{0,0},D_{0,1},D_{1,0},D_{0,2}} \right\rbrack^{T}} & \; \end{matrix}$

The relaxation time, τ, can be calculated from the state matrix (6). For example, with τ(T) determined for A10:U10 as described above, k_(i) determined from FIG. 1, state space matrix (6) is solved for the largest eigenvalue, λ_(max). λ_(max) is related to τ though equation (7):

τ=−1/λ_(max)   (7)

With the relaxation time now obtained, k_(f) and k_(r) are obtained by simultaneously solving equation (1) in conjunction with equation (8):

τ=1/(k _(f) [S _(2eq) +S _(1eq) ]+k _(r))   (8)

wherein S_(1eq) and S_(2eq) are equilibrium concentrations of the first and second DNA single strands. FIG. 3 illustrates comparison between theoretically and experimentally estimates forward (k_(f)) and reverse (k_(r)) for single stranded (S₁, S₂) hybridization reaction. With k_(f) and k_(r) known, DNA concentration in excess of a predetermined threshold can be determined for a set of annealing reaction temperatures within a predetermined time period according to equation (9):

d[D]/dt=k _(f) [S ₁ ][S ₂ ]−k _(r) [D]  (9)

Similarly, DNA concentration in excess of a predetermined threshold can be determined for a set of annealing reaction time periods at a predetermined annealing reaction temperature according to equation (9). FIG. 4 illustrates determination of DNA concentration in excess of a predetermined threshold for a set of annealing reaction temperatures and time periods.

As noted above, k_(l)(T) for DNA sequence length in excess of 14 bases can be estimated based on experimental data. Estimation of k_(l) is a nonlinear parameter estimation problem in which a functional relationship between model parameters and experimentally measured dependent variable is available. For a known set of relaxation time vs. equilibrium concentration of single strands data, a least square objective function equal to the sum of squared differences between actual and predicted relaxation time can be minimized to estimate k_(l). Given a set of τ vs. [D_(eq)] data, such a parameter estimation problem can be formulated and k_(l) can be estimated at a specific temperature. For a fixed temperature, when the number base pairs, N, of a given sequence increases, k_(l) should increase. This can be explained with more physical arguments and reasoning. For a fixed temperature, a sequence with more base pairs is more stable than a sequence with less base pairs. The individual stability constant s_(i) is equal to the ratio k_(l)/k_(−i). Though at a fixed temperature, s is a constant since a shorter sequence should be higher and k_(−l) should be lower than those of a long sequence. k_(l) for the transient models can be calculated by using the characteristic equation of the state space matrix A, with the negative reciprocal of the relaxation time τ substituted in it as the eigenvalue λ, for the functional relationship:

τ=f([D] _(eq) , σ, S _(AT) , S _(GC) , k _(l) , N)   (10)

Since the state space matrix is relatively large, a bisection method is implemented that is described below to solve for k_(l) based on the transient models.

-   -   Assume an interval, [a, b] for k_(l). Note that at a, the         eigenvalue of the A should be greater than the actual eigenvalue         (λ_(actual)) and at b, it should be less than the actual         eigenvalue.     -   Let c=(a+b)/2     -   Find the eigenvalue of A at c.     -   If (λ_(c)−λ_(actual)<0), b=c, else a=c     -   Repeat until (λ_(c)−λ_(actual)<0)≈0.

Using the above algorithm for the transient models for the steady state model, IQ was determined for four different (self-complementary) sequences A4U4:A4U4, A5U5:A5U5, A6U6:A6U6 and A7U7:A7U7, using their relaxation time and initial concentration of single strands data. FIG. 1 illustrates the variation of k_(l) with respect to temperature for the sequences A5U5:A5U5 and A7U7:A7U7. The variation of apparent rate constant IQ with respect to temperature follows an Arrhenius relationship, and it is similar to k_(f) with a negative apparent activation energy that is a function of actual activation energy and the heat of base pair formation. FIG. 6 illustrates the variation of the activation energy for k_(l) with respect to N. As detailed above, when N increases, k_(l) increases, and hence the negative apparent activation energy decreases. For a fixed N, though a difference in apparent activation energy for homogeneous and heterogeneous sequences is expected, this difference may lie within an expected error bound. Therefore, a common k_(l) can be estimated for heterogeneous sequences as long as their lengths are the same. More experiments can be conducted with different compositions and lengths of DNA sequences to estimate k_(l) using least squares estimation rather than non-linear root finding.

II. PCR Amplification Efficiency

In another aspect, methods for enhancing amplification efficiency of a DNA sequence in one or more cycles of PCR are described herein. In some embodiments, a computer implemented method for enhancing amplification efficiency of a sequence of deoxyribonucleic acid (DNA) in a first cycle of polymerase chain reaction (PCR) is described herein, the method comprising accessing a memory via a computing device processor, where the memory comprises computer instructions that when operated by the computing device processor causes the computing device processor to (a) receive information comprising sequences of single strand DNA and primer, concentrations of the single strand DNA and primer, length of a predetermined time period for annealing the single strand DNA and primer and a set of annealing reaction temperatures; (b) determine the forward reaction rate constant, k_(f), of annealing the single strand DNA and primer at annealing reaction temperatures in the set; and (c) determine the reverse reaction rate reaction rate constant, k_(r), of annealing the single strand DNA and primer at reaction temperatures in the set. If desired, one or more of the annealing reaction temperatures sufficient to produce a primer hybridized DNA strand concentration in excess of a predetermined threshold within the predetermined time period for annealing can be selected. Alternatively, the extension kinetics described below can be determined and one or more of the annealing reaction temperature required for production of DNA in excess of a predetermined threshold can be selected.

In some embodiments, the computing device further (a′) determines a time period for duplex DNA melting sufficient to produce a concentration of single strand DNA in excess of a predetermined threshold. Additionally, in some embodiments, the computing device processor further (e) determines a time period for DNA extension sufficient to produce a concentration of DNA in excess of a predetermined threshold.

A. Primer-Template Annealing

As described above, a kinetic model of PCR includes kinetic models of melting, annealing, enzyme binding and extension reactions. The reaction below represents an annealing reaction of the single strands (S) and primers (P).

${S + P}\underset{k_{r\;}}{\overset{k_{f\;}}{\leftrightarrow}}{SP}$

A procedure similar to that described in Section I above is employed to determine k_(f), k_(r) for S-P annealing and SP concentration at a set of reaction temperatures and times. The overall Gibbs free energy and hence the equilibrium constant K_(annealing) for a given sequence at the chosen annealing temperature can be calculated using the NN model described in Section I herein.

$\begin{matrix} {K_{annealing} = {\frac{k_{f}}{k_{r}} = {\exp \left( {- \frac{\Delta \; G_{annealing}}{RT}} \right)}}} & (12) \end{matrix}$

Following determination of K_(annealing), a state space matrix is established for determining annealing relaxation time. The state space matrix can be set up as described in Section I herein employing equations (2)-(4) and (5)-(6) in conjunction with reaction mechanisms R₁ and R₂. The state space matrix is solved for the largest eigenvalue, λ_(max), and τ is obtained from equation (7). With the relaxation time now obtained, k_(f) and k_(r) are obtained by simultaneously solving equation (12) in conjunction with equation (13):

$\begin{matrix} {\tau = \frac{1}{{k_{f}\left( {\left\lbrack S_{eq} \right\rbrack + \left\lbrack P_{eq} \right\rbrack} \right)} + k_{r}}} & (13) \end{matrix}$

wherein P_(eq) and S_(eq) are equilibrium concentrations of the primer and single stranded DNA. With k_(f) and k_(r) known, primer hybridized DNA strand concentration in excess of a predetermined threshold can be determined for a set of annealing reaction temperatures within a predetermined time period according to equation (14):

d[SP]/dt=k _(f) [S][P]−k _(r) [SP]  (14)

Similarly, primer hybridized DNA strand concentration in excess of a predetermined threshold can be determined for a set of annealing reaction time periods at a predetermined reaction temperature according to equation (14). As described herein, k_(f) and k_(r) can be used solely for optimizing primer-DNA strand hybridization. FIG. 23 depicts the predicted evolution of reaction species for a typical primer annealing reaction with reaction temperature 20 deg C and initial single strand and primer concentrations either 1 nM or 10 nM. Primer sequence 1 is 5′ GCTAGCTGTAACTG 3′ and primer sequence 2 is 5′ GTCTGCTGAAACTG,3′. Alternatively, k_(f) and k_(r) can be used in conjunction with DNA extension rate constants in determining a temperature cycling profile for enhancing amplification efficiency of one or more cycles of PCR.

B. Enzyme Binding and Extension

In the extension step, enzymatic addition of nucleotides converts the primer-template duplex (SP) into a complete dsDNA. There are several reaction mechanisms proposed for enzyme binding and extension reactions and FIG. 7 represents a general reaction scheme. In step 1, enzyme binds with D_(i) molecule (primer-template with i added nucleotide bases) to form a binary complex E.D_(i). In step 2, a deoxynucleotide triphosphate (dNTP) binds with E.D_(i) to form a ternary complex, E.Di.dNTP, which undergoes a protein conformational change in step 3 and forms E′.D_(i).dNTP. In step 4, a the nucleotide is incorporated and a pyrophosphate molecule is released from D_(i) and as a result E′.D_(i+1).PP_(i) is formed. E′.D_(i+1).PP_(i) undergoes a conformational change in step 5 leading to the formation of E.D_(i+1).PP_(i). In step 6, PP_(i) is completely released from E.D_(i+1).PP_(i) and E.D_(i+1) is formed. Finally the dissociation of E.D_(i+1) produces D_(i+1) and E. Besides these steps, there are parallel dissociation reactions represented by step 7 and 8 may also occur.

The final step 6 is irreversible with a rate constant k₆. Steps 3, 4 and 5 which are all a first order reversible reactions represent the conformational change of a ternary complex and their rate constants values are higher than k₂ and the forward rate constants of each step is higher than the reverse rate constant. Hence, the overall dynamics is controlled by step 2 which forms a ternary complex, E.D_(i).dNTP and the final step is irreversible.

The reaction mechanism of FIG. 7 can be represented using the simplified reaction schemes given by reaction R₃.

$\begin{matrix} {{E + D_{i}}\underset{k_{- e}}{\overset{k_{e}}{\rightleftharpoons}}{{E \cdot D_{i}} + {dNTP}}\underset{k_{- 2}}{\overset{k_{2}}{\rightleftharpoons}}{E \cdot D_{i} \cdot {dNTP}}\overset{k_{cat}}{}{{E \cdot D_{i + 1}} + {PP}_{i}}\underset{k_{e}}{\overset{k_{- e}}{\rightleftharpoons}}{E + D_{i + 1}}} & \left( R_{3} \right) \end{matrix}$

Kinetic parameters k_(e), k_(−e), k_(cat) and K_(N)=(k⁻²±k_(cat))/k₂ in R₃ can be estimated for any polymerase using polymerase processivity and initial rate experiments and these temperature-dependent rate constants can be used, for example, to determine the optimal extension time(s) that minimizes the total PCR reaction time required to reach a predetermined threshold concentration of DNA. FIG. 8 identifies the steps involved in enzyme binding and extension model parameter estimation according to some embodiments described herein. For the present analysis, the themostable DNA polymerase is Thermus aquatics (Taq) polymerase.

1) Taq Reaction Rates (K_(N), k_(cat)): Michaelis Menten Kinetics

In order to accurately evaluate the intrinsic kinetics of Taq, an experimental framework to quantify the rate of dNTP incorporation is established. In particular, a DNA substrate that represents a typical substrate in PCR applications and does not form secondary structure is employed.

Previous reports on DNA polymerase kinetics mostly employed two broad classes of substrates: a) activated genomic DNAs namely, the activated salmon sperm DNA or the activated Calf thymus DNA and b) primed M13 ssDNAs containing different primers. However, neither the M13 ssDNA nor the activated salmon sperm DNA represent typical substrates used in PCR or other biotechnology processes. In the majority of polymerase reactions the substrate is small to medium length ssDNA template (50 to 500 bases)—annealed generally to a small (10 to 20 bases long) open-ended (unobstructed for extension at the 3′ end) oligomeric primer. Activated salmon sperm and calf thymus DNAs are large genomic DNA, with extension sites flanked and/or blocked by large duplex structures. Polymerase assisted dNTP incorporation in this substrate involves gap filling and strand replacement extensions. This type of extension processes is not typical of common applications of the polymerase extension reactions like the PCR. For different reasons, the large M13 ssDNA templates with molecular size of the order of 7.25 kilobases, also do not represent common PCR templates. It is likely that such a large genomic DNA will harbor secondary structures, which, if present, would inhibit or slow down the extension reactions. Presence and prevalence of secondary structures that may fall on the path of the primer extension will depend on the location on the template where the primer is attached and as such differences in apparent extension rates would be expected based on particular pairs of primed M13 used in the experiments. As with templates, secondary structures in the primer may also interfere with extension reactions.

The above discussion points to the fact that intrinsic reaction rates of a DNA polymerase cannot be accurately determined by using either the activated salmon sperm DNA or a ssDNA/primer complex where either the ssDNA template or the primer are prone to forming secondary structures. In addition, in order that reaction kinetics can be studied both at lower and higher temperatures, such as temperatures ranging from 50° C. to 75° C., the primer should have a T_(m) equal to or higher than the than the highest temperature to be studied. These difficulties are overcome by using a synthetic template-primer pair, comprising of a 80-mer synthetic ssDNA with an overall GC content of 47.5% and designed to show minimal secondary structure and yet containing a 17-nucleotide stretch of high GC-containing sequence to accommodate a complementary 17-mer primer with high T_(m) (88% GC, T_(m)=80° C.), so that the kinetics of the reaction can be studied at different temperatures and up to or beyond the temperature at which the enzyme is supposed to exhibit its highest activity.

The single stranded TAT2 template (80 nt, 47.5% GC) was primed by using 7 equivalents of its 17-mer forward primer TAP1 (88% GC) in 1× Taq buffer (20 mM Tris-HCl, 50 mM KCl) containing 2 mM MgCl₂. The mixture was heated to 80° C. and cooled to room temperature by successively holding the temperatures and cooling as follows: hold at 95° C. for 1.5 minute, cool to 75° C. over 30 minutes, hold at 75° C. for 5 minutes (for slow primer annealing), cool to 70° C. over 20 minutes, cool to 45° C. over 45 minutes, and then cool to room temperature on the bench. The mixture was aliquoted in reaction tubes and Taq polymerase was added followed by dNTPs so that the final concentrations in each 20 μl reaction volume were 200 nM template (5.0 ng/μl), 1400 nM primer (7.24 ng/μl), 2 μM taq (0.056 U/μl) and various concentration of dNTPs. Studies of the bireactant Michaelis Menten (MM) reaction kinetics for Taq were performed at various extension temperatures from 55° C. to 75° C., at 5° C. intervals. Reactions at each temperature were setup with 12 different concentrations of dNTP ranging from 2-1000 μM while keeping free MgCl₂ concentration constantly at 2 mM, compensating for the chelation effect by dNTPs. Triplicate reaction tubes were first equilibrated in the CFX96 PCR machine at the extension temperature for 30 min before adding the dNTPs to start the reaction. Reactions were terminated at 0.33, 0.5, 0.67, 0.83, 1, 2, 3.5, 5, 6.5, 8.5 and 10 min by chilling to 4° C. and adding EDTA to a final concentration of 20 mM. Aliquots of 20 μl reaction mixture were then mixed with 80 μl of 1:178 dilution (in 1× TE buffer) of the PicoGreen Reagent. Fluorescence was measured in Fluoroskan Ascent Fluorescence 96/384 Well Plate Reader (485 nm excitation and 520 nm emission) after an incubation of 5 min at 37° C. The quantities of dNTP incorporated at each temperature and for each concentration were then plotted against time and the initial rates of the reactions at each temperature and each concentration were determined by fitting the reaction curve into a first-order reaction model.

FIG. 9 illustrates the Michaelis-Menten plots of the activity of the Taq polymerase at various temperatures. The template used was the 80-mer synthetic ssDNA TAT2 annealed to a 17-mer ssDNA TAP1 at the 3′end. Both the template and the primer have been tested to be free of secondary structures. The individual points in the curves were obtained from time course experiments at each dNTP concentration at each temperature. The fluorescence intensity data was converted into nM of dNTP incorporated from a calibration curve. The kinetic parameters calculated from FIG. 9 are tabulated in Table 2. k_(cat) denotes the catalytic rate constant for nucleotide addition and K_(N) denotes the Michealis constant for nucleotide addition.

TABLE 2 Values of K_(N); k_(cat); and k_(cat)/K_(N) for Taq polymerase at various temperature. Temperature/° C. K_(N)/μM k_(cat)/(sec⁻¹) k_(cat)/K_(N)/(μM⁻¹ sec⁻¹) 50 140 ± 10 12.7 ± 0.4 0.09 ± 0.01 55 250 ± 30 36 ± 2 0.14 ± 0.02 60 200 ± 20 49 ± 2 0.25 ± 0.02 65 220 ± 20 71 ± 2 0.33 ± 0.03 70 120 ± 10 79 ± 3 0.65 ± 0.07 75 91 ± 8 83 ± 2 0.91 ± 0.09

2) Calculate kcat/Kn at the Desired Temperatures

$\frac{k_{cat}}{K_{N}}$

for Taq polymerase is estimated at various temperatures based on the bireactants MM kinetics formulation above. FIG. 10 exhibits the temperature dependent apparent second order extension rate constant

$\frac{k_{cat}}{K_{N}}.$

3) Run Single-Hit Condition Extension Experiments on the Polymerase at the Different Temperatures

In single hit conditions, enzyme concentrations are sufficiently low that the probability of re-association is approximately zero. Therefore, they do not allow enzyme re-association. Hence enzyme-template association occurs only during the initial equilibration of enzyme with SP. Thus, the following reaction scheme for the addition of n base pairs is written

$\begin{matrix} {{{E + {D_{0}({SP})}}\underset{k_{- e}}{\overset{k_{e}}{\rightleftharpoons}}{{E \cdot D_{0}} + N}\underset{k_{- 2}}{\overset{k_{2}}{\rightleftharpoons}}{E \cdot D_{0} \cdot {N\overset{k_{cat}}{}E} \cdot D_{1}}}{{E + {{D_{i}\overset{k_{{- e}\;}}{}E} \cdot D_{i}} + N}\underset{k_{- 2}}{\overset{k_{2}}{\rightleftharpoons}}{{E \cdot D_{i} \cdot {N\overset{k_{cat}}{}E} \cdot {D_{i + 1}\overset{k_{- e}}{}E}} + D_{i + 1}}}} & \left( R_{4} \right) \\ {{{E + {{D_{n - 1}\overset{k_{- e}}{}E} \cdot D_{n - 1}} + N}\underset{k_{- 2}}{\overset{k_{2}}{\rightleftharpoons}}{{E \cdot D_{n - 1} \cdot {N\overset{k_{cat}}{}E} \cdot {D_{n}\overset{k_{cat}^{\prime}}{}E}} + D_{n}}}\mspace{20mu} {{i = 1},2,{{\ldots \mspace{14mu} n} - 1}}} & \; \end{matrix}$

Single hit conditions are used to estimate polymerase processivity parameters. Processivity is defined as the average number of nucleotide incorporated per DNA-enzyme binding event. Gel electrophoretic analysis of the distribution of lengths of partially extended primer template DNAs Di in a single hit condition experiment provides the probability of dissociation of the polymerase at each position/time i , which can then can be used to calculate the processivity as described in 4).

4) Determine Processivity of the Polymerase

Let i index the sequence positions on the template. In a Markov chain formulation of dissociation, the index i at which dissociation occurs is denoted i_(off). Let p denote the conditional probability of the polymerase not dissociating at position/time i, given that it was bound to the template at position/time i−1. The probability of dissociation at position/time i is then

p _(off)(i)=(1−p)p^(i−1)   (15)

p is called the microscopic processivity parameter. The expected position of dissociation of the polymerase (expected stopping time) sometimes called the processivity, can be written as

$\begin{matrix} {{E\left\lbrack i_{off} \right\rbrack} = \frac{1}{1 - p}} & (16) \end{matrix}$

E[i_(off)] is generally reported as the processivity instead of the microscopic processivity parameter. The above expression is derived for, a template of infinite length. Usually, in processivity experiments long templates are used to estimate p. For finite length,

$\begin{matrix} {{p_{off}(i)} = {1 - {\sum\limits_{i = 0}^{n - 1}{\left( {1 - p} \right){p^{i - 1}.}}}}} & (17) \end{matrix}$

For heterogeneous templates, p will vary with position. From processivity experiments, one can obtain the p at each position since we will have

$\begin{matrix} {{p_{off}(i)} = {\left( {1 - p_{i}} \right){\prod\limits_{j = 0}^{i - 1}{p_{j}{\forall i}}}}} & (18) \end{matrix}$

These equations can be used to solve uniquely for each p_(i). However, it is impractical to do processivity experiments for each new template. Hence, processivity experiments on templates with different types of nearest neighbor motifs (including hairpins) can be conducted for a given polymerase, and then nearest neighbor processivity parameters can be used in modeling of an arbitrary sequence.

The following Table provides the values of E[i_(off)] and the conditions at which they are measured:

Nucleotide Temperature concentration Reference (° C.) [N] (μM) E [i_(off)] Wang et al (36) 72 250 22 Davidson et al (37) 60 800 50-80

5) Relate to Nucleotide Addition Rate Constants and Processivity

Now, at a fixed temperature, we seek a relationship between processivity of an enzyme and the rate constants of the reaction scheme R₄. In order to do this, we write the state space model for the reaction scheme R₄. We omit

${E \cdot {D_{n}\overset{k_{cat}^{\prime}}{}E}} + D_{n}$

from the state space model for simplicity as it does not affect equilibrium and we are not estimating the corresponding rate constant. Since the substrate, dNTP, concentration is always in excess compared to enzyme, Michaelis-Menten (MM) kinetics is valid and hence the steady state assumption for the intermediate concentration is valid. Therefore,

$\begin{matrix} {{{\frac{d}{dt}\left\lbrack {E \cdot D_{i} \cdot N} \right\rbrack} = {{{{- \left\lbrack {E \cdot D_{i} \cdot N} \right\rbrack}\left( {k_{- 2} + k_{cat}} \right)} + {{k_{2}\left\lbrack {E \cdot D_{i}} \right\rbrack}\lbrack N\rbrack}} = {\left. 0\Rightarrow\left\lbrack {E \cdot D_{i} \cdot N} \right\rbrack \right. = \frac{\left\lbrack {E \cdot D_{i}} \right\rbrack \lbrack N\rbrack}{K_{N}}}}}\mspace{20mu} {{{where}\mspace{14mu} K_{N}} = {\frac{k_{- 2} + k_{cat}}{k_{2}}.}}} & (19) \end{matrix}$

Through analysis of the state space model, it is found that

$\begin{matrix} {{p = \frac{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}{{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack} + k_{- e}}}{Therefore}{\frac{k_{- e}}{{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack} + k_{- e}} = {\left. {1 - p}\Rightarrow k_{- e} \right. = \frac{{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left( {1 - p} \right)}{p}}}} & (20) \end{matrix}$

Equation (20) can be written as follows in terms of processivity:

$\begin{matrix} {k_{- e} = {{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left( \frac{1}{{E\left\lbrack i_{off} \right\rbrack} - 1} \right)}} & (21) \end{matrix}$

As per equation (21), if

$\frac{k_{cat}}{K_{N}}$

and processivity of a polymerase is known at a specific temperature, it is possible to estimate k⁻¹ for any polymerase. equation (21) is valid under the approximations applied in the derivation of (R3). In equation (21) k_(−e) and

$\frac{k_{cat}}{K_{N}}$

are concentration independent terms and hence, the processivity E[i_(off)] or the conditional probability p depends on [N].

Using the above values, k_(cat)/K_(N) and equation (21), k_(−e) at 60° C. and 72° C., has been estimated, respectively. The k_(−e) for S. solfataricus P2 DNA polymerase B1(36) at 37° C. is known and the same value for Taq polymerase enzyme is employed as their equilibrium constants are of the same order of magnitude. Thus, k_(−e) can be obtained at three different temperatures and hence an Arrhenius relationship is fitted as shown in FIG. 11 to estimate the temperature dependent dissociation rate constant k_(−e). k_(cat)′, the rate constant for dissociation of DNA from E.D_(n) that is depicted in reaction scheme (R4), is approximated by k_(−e).

6. Solve for k_(e)(T)

Equilibrium thermodynamic analysis for the polymerase enzyme binding reaction has been studied extensively with the Thermus aquaticus enzyme. Such studies estimated the temperature dependent equilibrium constant, which is the ratio of k_(e)/k_(−e). Therefore, based on the temperature dependence of k_(−e) and K_(binding), k_(e) has been estimated and its Arrhenius plot is shown in FIG. 12. For the specified extension temperature T and each of the extension times considered, the amount DNA produced by solving the system of extension rate equations for PCR is predicted, using the values of kcat/Kn (T), k_(e)(T), and k_(−e)(T) determined above in conjunction with equations (22) and (23).

$\begin{matrix} {{\frac{d}{dt}\left\lbrack {E \cdot {SP}} \right\rbrack} = {{{k_{e}\lbrack E\rbrack}\lbrack{SP}\rbrack} - {k_{- e}\left\lbrack {E \cdot {SP}} \right\rbrack} - {{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left\lbrack {E \cdot {SP}} \right\rbrack}}} & (22) \end{matrix}$

where [N] is the nucleotide concentration.

$\begin{matrix} {{\frac{d}{dt}\left\lbrack {E \cdot D_{i}} \right\rbrack} = {{{k_{e}\lbrack E\rbrack}\left\lbrack D_{i} \right\rbrack} - {k_{- e}\left\lbrack {E \cdot D_{i}} \right\rbrack} + {{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left\{ {\left\lbrack {E \cdot D_{i - 1}} \right\rbrack - \left\lbrack {E \cdot D_{i}} \right\rbrack} \right\}}}} & (23) \end{matrix}$

wherein [E.SP] is concentration of enzyme bound hybridized DNA strand concentration and [N] is the nucleotide concentration. In view of the foregoing derivation and analysis, the following non-limiting examples are provided.

EXAMPLE 1 Determination of Annealing Temperature (1^(st) PCR Cycle)

Input Data:

Denaturation temperature is 90° C., denaturation time is 30 sec. Extension temperature is 72° C. Annealing and extension times are both 30 sec. The set of possible annealing temperatures is provided in the FIG. 13. We can set any predetermined threshold DNA concentration value and for the specified annealing time (30 sec), find the annealing temperature at which the curve crosses this threshold graphically—this sets annealing temperature. The sequences of the primers are P1=5′ GCTAGCTGTAACTG 3′; P2−5′ GTCTGCTGAAACTG 3′. S1 and S2 are 500 bp sequences with terminal primer binding sites fully complementary to the sequences above. Primer, enzyme and dNTP concentrations are 0.2 μM, 2 nM and 800 μM respectively. Consistent with the rate equations described hereinabove, a detailed set of rate equations employed in this example are as follows:

Melting

${\frac{d}{dt}\lbrack{DNA}\rbrack} = {{{k_{- m}\left\lbrack S_{1} \right\rbrack}\left\lbrack S_{2} \right\rbrack} - {k_{m}\lbrack{DNA}\rbrack}}$

Annealing

$\begin{matrix} {{{{\frac{d}{dt}\left\lbrack {S_{j}P_{j}} \right\rbrack} = {{{k_{f}^{j}\left\lbrack S_{j} \right\rbrack}\left\lbrack P_{j} \right\rbrack} - {k_{r}^{j}\left\lbrack {S_{j}P_{j}} \right\rbrack} - {{k_{e}\lbrack E\rbrack}\left\lbrack {S_{j}P_{j}} \right\rbrack} + {k_{- e}\left\lbrack {{E \cdot S_{j}}P_{j}} \right\rbrack}}},\mspace{20mu} {j = 1},2}\mspace{20mu} {{{\frac{d}{dt}\left\lbrack P_{j} \right\rbrack} = {{- {{k_{f}^{j}\left\lbrack S_{j} \right\rbrack}\left\lbrack P_{j} \right\rbrack}} + {k_{r}^{1}\left\lbrack {S_{j}P_{j}} \right\rbrack}}},{j = 1},2}} & \; \\ {{{\frac{d}{dt}\left\lbrack S_{j} \right\rbrack} = {{{k_{f}^{1}\left\lbrack S_{j} \right\rbrack}\left\lbrack P_{j} \right\rbrack} - {k_{r}^{j}\left\lbrack {S_{j}P_{j}} \right\rbrack} - {{k_{f}^{*}\left\lbrack S_{1} \right\rbrack}\left\lbrack S_{2} \right\rbrack} + {k_{r}^{*}\left\lbrack {S_{1}S_{2}} \right\rbrack}}},{j = 1},2} & \; \end{matrix}$

Employing the analytical steps set forth in Sections I and II(A) hereinabove, k_(f) and k_(r) for each primer are determined for the annealing reaction at the three annealing temperatures of 30° C., 35° C. and 40° C. provided in FIG. 13. From this, the primer hybridized DNA strand concentration can be determined at any time. Such information can be used in the solution of the system of rate equations governing the entire PCR cycle to select annealing temperature(s) and/or time(s) for enhancing amplification efficiency. For example, a temperature for DNA extension is selected and k_(e) and k_(−e) are determined employing the analytical steps set forth in Sections I and II(B) hereinabove. Consistent with the rate equations described hereinabove, a detailed set of rate equations for enzyme binding and extension employed in this example are as follows:

Enzyme Binding and Extension

$\mspace{20mu} {{{\frac{d}{dt}\left\lbrack {{E \cdot S_{j}}P_{j}} \right\rbrack} = {{{k_{e}\lbrack E\rbrack}\left\lbrack {S_{j}P_{j}} \right\rbrack} - {k_{- e}\left\lbrack {{E \cdot S_{j}}P_{j}} \right\rbrack} - {k_{cat}\frac{\left\lbrack {{E \cdot S_{j}}P_{j}} \right\rbrack \lbrack N\rbrack}{K_{N}}}}},\mspace{20mu} {j = 1},2}$ $\mspace{20mu} {{{\frac{d}{dt}\left\lbrack {S_{j}P_{j}} \right\rbrack} = {{k_{- e}\left\lbrack {{E \cdot S_{j}}P_{j}} \right\rbrack} - {{k_{e}\left\lbrack {S_{j}P_{j}} \right\rbrack}\lbrack E\rbrack}}},{j = 1},2}$ ${{\frac{d}{dt}\left\lbrack {E \cdot D_{i}^{j}} \right\rbrack} = {{{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left\{ {\left\lbrack {E \cdot D_{i - 1}^{j}} \right\rbrack - \left\lbrack {E \cdot D_{i}^{j}} \right\rbrack} \right\}} - {k_{- e}\left\lbrack {E \cdot D_{i}^{j}} \right\rbrack} + {{k_{e}\left\lbrack D_{i}^{j} \right\rbrack}\lbrack E\rbrack}}},\mspace{20mu} {i = 1},\ldots \mspace{14mu},{{n - 1};{j = 1}},2$ $\mspace{20mu} {{{\frac{d}{dt}\left\lbrack D_{i}^{j} \right\rbrack} = {{k_{- e}\left\lbrack {E \cdot D_{i}^{j}} \right\rbrack} - {{k_{e}\left\lbrack D_{i}^{j} \right\rbrack}\lbrack E\rbrack}}},{i = 1},\ldots \mspace{14mu},{{n - 1};{j = 1}},2}$ $\mspace{20mu} {{{\frac{d}{dt}\left\lbrack {E \cdot D_{n}^{j}} \right\rbrack} = {{{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left\lbrack {E \cdot D_{n - 1}^{j}} \right\rbrack} - {k_{cat}^{\prime}\left\lbrack {E \cdot D_{n}^{j}} \right\rbrack}}},{j = 1},2}$ $\mspace{20mu} {{\frac{d}{dt}\lbrack{DNA}\rbrack} = {k_{cat}^{\prime}\left\{ {\left\lbrack {E \cdot D_{n}^{1}} \right\rbrack + \left\lbrack {E \cdot D_{n}^{2}} \right\rbrack} \right\}}}$ ${\frac{d}{dt}\lbrack N\rbrack} = {{- {\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}}\left\{ {\left\lbrack {{E \cdot S_{1}}P_{1}} \right\rbrack + \left\lbrack {{E \cdot S_{2}}P_{2}} \right\rbrack + {\sum\limits_{i = 1_{1}}^{n - 1}\left( {\left\lbrack {E \cdot D_{i}^{1}} \right\rbrack + \left\lbrack {E \cdot D_{i}^{2}} \right\rbrack} \right)}} \right\}}$

From this, the DNA concentration can be determined at any reaction time. Combining the solved rate equations of annealing at the three different temperatures with the solved extension kinetics provides solutions to the system of rate equations governing the entire PCR cycle to select annealing temperature(s) and/or time(s) for enhancing amplification efficiency. Alternatively the complete set of rate constants can be calculate at both the annealing and extension temperatures using the analytical methods in Sections I and II and the complete set of rate equations above can be solved simultaneously for both annealing and extension steps.

FIG. 14 illustrates solutions to the system of rate equations for the entire PCR cycle for the three different annealing temperatures for the primer set. From FIG. 14, it can be seen that annealing temperatures of 30° C. and 35° C. provide greater concentrations of final DNA product than an annealing temperature of 40° C. It can also been seen from FIG. 14 that a desired threshold of DNA concentration can be pre-selected and the PCR cycle run at a temperature and for a time period for achieving the selected DNA threshold.

EXAMPLE 2 Calculation of Extension Time (Multiple PCR Cycles)

Input data:

The thermostable DNA polymerase is Taq polymerase. The primers are the same as those used in Example 1. The set of possible extension times is 30 seconds to 120 seconds in intervals of 0.1 second. The method for optimal extension time determination for three different PCR denaturation/annealing protocols is illustrated: a) 30 s denaturation time at 90° C.; 45 s annealing time at 35 C; b) 30 s denaturation time at 90° C.; 45 s annealing time at 40 C; c) 30 s denaturation time at 90° C.; 120 s annealing time at 40° C. For the purposes of this example the predetermined extension temperature is set to 72° C. The threshold DNA concentration is 100 nM. The sequences of the primers are the same as in Example 1, template length is 500 bp. Species concentrations are the same as in Example 1.

Annealing rate constants for primer hybridized DNA strand concentration are determined as set forth in Example 1 and/or Sections I and II(A) hereinabove. Similarly, extension rate constants are determined as set forth in Example 1 and/or Sections I and II(B) hereinabove. With these rate constants determined, the optimal extension time is determined as follows.

Let 0≤η≤1 be the efficiency in the first PCR cycle and the n(η) be the number of cycles in stage 1, then

$\begin{matrix} {{\left( {1 + \eta} \right)^{n} = y}{where}{y = \frac{\lbrack{DNA}\rbrack_{n}}{\lbrack{DNA}\rbrack_{o}}}} & (24) \end{matrix}$

and [DNA]_(n) is the concentration of DNA after n cycles and [DNA]_(o) is the initial concentration of DNA. From equation (30), the number of cycles n(η) can be expressed as follows:

$\begin{matrix} {{n(\eta)} = \frac{\log (y)}{\log \left( {1 + n} \right)}} & (25) \end{matrix}$

Let t(η) be the time required for a cycle to achieve an efficiency of η and the overall reaction time for state 1 be t_(total) (η); then

$\begin{matrix} {{t_{total}(\eta)} = {{{n(\eta)}{t(\eta)}} = {{t(\eta)}\frac{\log (y)}{\log \left( {1 + \eta} \right)}}}} & (26) \end{matrix}$

t(n) is computed by solving the annealing and extension rate equations listed above. Now, the following optimization problem to minimize the overall reaction time t_(total) is considered:

$\begin{matrix} {{\min\limits_{\eta}{t_{total}(\eta)}} = {\min\limits_{\eta}{{t(\eta)}\frac{\log (y)}{\log \left( {1 + \eta} \right)}}}} & (27) \end{matrix}$

Hence, η is sought such that

$\begin{matrix} {\frac{d\left( {{t(\eta)}\frac{\log (y)}{\log \left( {1 + \eta} \right)}} \right)}{d\; \eta} = {{{0\frac{1}{t(\eta)}}\frac{dt}{d\; \eta}} = \frac{1}{\left( {1 + \eta} \right)\left( {\log \left( {1 + \eta} \right)} \right)}}} & (28) \end{matrix}$

The above equation is solved graphically by plotting the left hand side and the right hand side of the above equation with respect to η. From the given DNA concentration profile, it is possible to estimate the optimal efficiency (η*) and hence t(η*) that minimizes the overall reaction time for geometric growth in state 1 subject to the specified constraints.

It can be observed that given this sample set, it is possible to reduce the overall reaction time at the specific reaction temperatures. FIG. 15 compares the geometric growth of DNA with respect to reaction time under the above three reaction conditions. FIG. 16 illustrates the modified temperature protocol based on this analysis. Further, FIG. 17 illustrates optimal cycle efficiency at the three different reaction conditions. The optimal extension times corresponding to the three examples described above are found to be a) 27.8 seconds, b) 24.5 seconds and c) 15.3 seconds.

EXAMPLE 3 Solution of DAE for Determination of Optimal PCR Annealing/Extension Protocol (Fixed Cycle Time)

The temperature cycling protocol for PCR can be automatically determined by solving an optimal control problem. The optimal control problem, in some embodiments, is of the following form, which maximizes DNA product concentration at a fixed final time:

${\min\limits_{T{(t)}}J},{J = {\sum\limits_{i = 1}^{{4n} + 4}{x_{i}\left( t_{f} \right)}}}$

subject to the constraint

${\frac{dx}{dt} = {f\left( {{x(t)},{T(t)},k} \right)}},{{x(0)} = x_{0}}$

where T(t) denotes the temperature of the reaction mixture as a function of reaction time, x=[[S₁],[S₁P₁],[E.S₁P₁],[D₁ ¹],[E.D₁ ¹], . . . , [DNA], [P₁], [P₂], [N], [E]]^(Tr) is the vector of concentrations of all reaction species, x₀ denotes the vector of initial concentrations, t_(f) denotes the predetermined time for a single PCR cycle, and f(x(t),T(t),k) denotes the right-hand side of the system of rate equations for primer annealing, enzyme binding and polymerase-mediated extension and k is a vector of all kinetic rate constants appearing in the system of PCR rate equations. f (x(t),T (t), k) can be expressed in terms of the sequence-dependent rate constants k_(i) appearing in

$\left\lbrack {k_{m},k_{- m},k_{f}^{1},k_{r}^{2},k_{f}^{2},k_{r}^{2},k_{e},k_{- e},\frac{k_{cat}}{K_{N}},k_{cat}^{\prime}} \right\rbrack \mspace{14mu} {as}\mspace{14mu} {\sum\limits_{i}{{k_{i}\left( {T(t)} \right)}{g_{i}\left( {x(t)} \right)}}}$

gi(x(t)) denotes the part of the right-hand side of the system of rate equations for DNA melting, primer annealing, enzyme binding and polymerase-mediated extension associated with rate constant k_(i). The k_(i) T(t)) are determined according to the analytical methods described hereinabove in Sections I and II. Then, the PCR dynamic model can be expressed as

$\begin{matrix} {{\frac{{dx}(t)}{dt} = {\sum\limits_{i = 1}^{10}{{k_{i}\left( {T(t)} \right)}{g_{i}\left( {x(t)} \right)}}}}{{{k \in \bullet^{10}};{k \geq 0};x},{{g_{i}(x)} \in \bullet^{{4n} + 9}},{{x \geq 0};{{h(x)} \in \bullet^{5}}},{{h(x)} = 0}}{k = \left\lbrack {k_{m},k_{- m},k_{f}^{1},k_{r}^{1},k_{f}^{2},k_{r}^{2},k_{1}^{e},k_{- 1}^{e},\frac{k_{cat}}{K_{N}},k_{cat}^{\prime}} \right\rbrack}} & (29) \end{matrix}$

where n denotes the number of base pairs, g₁(x) to g₂(x) represent melting alone, g₃(x) to g₆(x) represent the annealing reactions alone and g₇(x) to g₁₀(x) represent the extension reactions and h(x) denotes a set of 5 independent nonlinear constraints enforcing mass balance, which cause the system to evolve on a state manifold X⊂

^(4n+9) that is of dimension 4n+4. Hence the dimension of the state manifold is also sequence dependent. Additional constraints, including equality constraints may also be applied to the vector of rate constants k. Therefore, the equality and inequality constraints on controls are sequence-dependent. As a result of this, the k_(i)g_(i) in in equation (29) are also sequence-dependent.

For n=2,

x=[[S₁],[S₁P₁],[E.S₁P₁],[D₁ ¹],[E.D₁ ¹],[E.D₂ ¹],[S₂],[S₂P₂],[E.S₂P₂],[D₁ ²],[E.D₁ ²],[E.D₂ ²],[DNA],[P₁],[P₂],[N],[E]]^(Tr).

g(x) for a PCR reaction with n=2 is presented in FIG. 18. Additional equality constraints for the state variables are given below.

$\begin{matrix} {{h_{1}(x)} = {x_{14}^{o} - x_{1}^{o} + x_{1} - x_{14}}} & (30) \\ {{h_{2}(x)} = {x_{15}^{0} - x_{7}^{0} + x_{7} - x_{15}}} & (31) \\ {{h_{3}(x)} = {x_{1}^{0} + x_{7}^{0} - \left( {\sum\limits_{i = 1}^{13}x_{i}} \right) - {\left( \frac{x_{16}}{K_{N}} \right)\left( {\sum\limits_{{i = 3},5,9,11}x_{i}} \right)}}} & (32) \\ {{h_{4}(x)} = {\frac{x_{16}^{0} - \left( {\sum_{{i = 4},5,10,11}x_{i}} \right) - {2\left( {x_{6} + x_{12} + x_{13}} \right)}}{\left( {1 + \frac{1}{K_{N}}} \right)\left( {\sum_{{i = 5},11}x_{i}} \right)} - x_{16}}} & (33) \\ {{h_{5}(x)} = {x_{17}^{0} - {\left( {1 + \frac{x_{16}}{K_{N}\;}} \right)\left( {\sum\limits_{{i = 3},5,9,11}x_{i}} \right)} - x_{6} - x_{12}}} & (34) \end{matrix}$

The additional algebraic constraint T₁<T<T₂ is imposed on the dynamic optimization problem, where T₁ and T₂ denote predetermined minimum and maximum allowed reaction temperatures (e.g. 30° C. and 95° C.) respectively.

Input Data:

A fixed time PCR optimal control problem with the primer sequences mentioned above and the same reaction conditions (length of target=500 base pairs) has been solved as an example. The reaction time is fixed to be 75 seconds. The denaturation step is not optimized and is set to 95° C./30 s. The lower and upper bounds T₁ and T₂ on the temperature are set to 30° C. and 95° C., respectively.

The optimization problem is formulated as a system of differential-algebraic equations (DAE) of:

$\begin{matrix} {\mspace{20mu} {{{{{H\left\lbrack {{x(t)},{\varphi (t)},{T(t)}} \right\rbrack} = {{\varphi^{Tr}(t)}{\sum\limits_{i = 1}^{10}{{k_{i}\left( {T(t)} \right)}{g_{i}\left( {x(t)} \right)}}}}}{\frac{d\; \varphi}{dt} = {{- {\nabla_{x}{H\left\lbrack {{x(t)},{\varphi (t)},{T(t)}} \right\rbrack}}} = {- {\sum\limits_{i = 1}^{10}{{k_{i}\left( {T(t)} \right)}\left( \frac{\partial{g_{i}\left( {x(t)} \right)}}{\partial{x(t)}} \right)^{Tr}{\varphi (t)}}}}}}}\mspace{20mu} {{\varphi \left( t_{f} \right)} = {\left\lbrack {\nabla_{x}{F\left( {x(t)} \right)}} \right\rbrack_{t = t_{f}} = \left\lbrack {\nabla_{x}{\sum\limits_{i = 1}^{{4n} + 4}{x_{i}(t)}}} \right\rbrack_{t = t_{f}}}}}{{{\frac{\partial H}{\partial{T(t)}}\left\lbrack {{x^{*}(t)},{\varphi^{*}(t)},{T^{*}(t)}} \right\rbrack} = {\left. {{\varphi^{*{,{Tr}}}(t)}{\sum\limits_{i = 1}^{10}\frac{\partial{k_{i}\left( {T(t)} \right)}}{\partial{T(t)}}}} \middle| {}_{T^{*}{(t)}}{g_{i}\left( {x^{*}(t)} \right)} \right. = 0}},\mspace{20mu} {\forall{t \in \left\lbrack {0,t_{f}} \right\rbrack}}}}} & (35) \end{matrix}$

H[x(t), φ(t), T(t)] is the PMP-Hamiltonian function. φ represents the co-state (a function of time similar to a Lagrange multiplier and analogous to momentum in Hamiltonian mechanics).

FIG. 19 illustrates a flow chart for the Control Vector Iteration (CVI) method that was used to solve the optimal control problem. With reference to FIG. 19, the following steps are provided.

-   1. For a given initial trajectory of T₀(t) and initial condition     x_(o), the state equations are integrated from t₀ to t_(f) and x(t)     is obtained. -   2. Boundary condition for the co-state vector is obtained using     equation:

$\begin{matrix} {{{\frac{\partial H}{\partial{T(t)}}\left\lbrack {{x^{*}(t)},{\varphi^{*}(t)},{T^{*}(t)}} \right\rbrack} = {\left. {{\varphi^{*{,{Tr}}}(t)}{\sum\limits_{i = 1}^{10}\frac{\partial{k_{i,{seq}}\left( {T(t)} \right)}}{\partial{T(t)}}}} \middle| {}_{T^{*}{(t)}}{g_{i}\left( {x^{*}(t)} \right)} \right. = 0}},\mspace{20mu} {\forall{t \in \left\lbrack {0,t_{f}} \right\rbrack}}} & (36) \end{matrix}$

-   3. Co-state equations are integrated backwards from t_(f) to t₀ and     φ(t) -   4. Using x(t), φ(t), and T₀(t), at each t,

$\frac{\partial H}{\partial{T(t)}}$

was found.

-   5. Using x(t_(f)), the objective function value J was found. -   6. If J≤J_(desired) and

${{\frac{\partial H}{\partial{T(t)}}} \leq \epsilon},$

the iterative process was stopped at this point, otherwise the iteration was continued until the next step.

-   7. Golden section line search method was followed to estimate the     optimal α(t).

With the available state and co-state variables, the manipulated variable is updated based on the steepest descent methods such that the nominal control profile satisfies equation (36). The following equation describes the steepest descent method.

$\begin{matrix} {{T_{i + 1}(t)} = {{T_{i}(t)} - {\alpha \frac{\partial H}{\partial{T(t)}}}}} & (37) \end{matrix}$

Where α is a steepest descent parameter which is generally assumed to be a constant in a conventional CVI. In order to accelerate the process, a can be estimated through line search via the following minimization.

$\begin{matrix} {{\min\limits_{\alpha}\; {F\left( {\alpha (t)} \right)}} = {H\left\lbrack {{x(t)},{\varphi (t)},{{T(t)} - {\alpha \frac{dH}{dt}(t)}}} \right\rbrack}} & (38) \end{matrix}$

A golden section search algorithm is employed for the above minimization.

-   8. The control profile is updated as per equation (37). -   9. With the updated control profile, step 1 is reinitiated and other     steps are followed until the iteration satisfies the convergence     criteria,

$J \leq {0.05\mspace{14mu} {and}\mspace{14mu} {\frac{\partial H}{\partial{T(t)}}}} \leq 10^{- 4}$

FIGS. 20 and 21 show the optimal annealing and extension temperature profile and the corresponding optimal DNA concentration profile by solving the PCR optimal control problem by CVI. The effciency at the optimal operating condition is 96%. The optimal control solutions T(t) for stage 1 are periodic. Hence the T(t) from FIG. 21 can be applied repeatedly for the first 14 PCR cycles. These are optimal PCR cycling strategies for fixed cycle time. FIG. 22 depicts the convergence of the algorithm.

Alternatively, the optimal control problem, in some embodiments, is of the form:

${\min\limits_{T{(t)}}\; J},{J = t_{f}}$

subject to the constraints

$\begin{matrix} {{\frac{dx}{dt} = {f\left( {{x(t)},{T(t)},k} \right)}},{{x(0)} = x_{0}}} & \left. {iii} \right) \\ {{\lbrack{DNA}\rbrack \left( t_{f} \right)} = \lbrack{DNA}\rbrack_{target}} & \left. {ii} \right) \end{matrix}$

where T(t) denotes the temperature of the reaction mixture as a function of reaction time, is the vector of concentrations of all reaction species, t_(f) denotes the predetermined time for a single PCR cycle, and f(x(t),T(t),k) denotes the right-hand side of the system of rate equations for primer annealing, enzyme binding and polymerase-mediated extension and k is a vector of all kinetic rate constants appearing in the system of rate equations. As described above, f(x(t),T(t),k) can be expressed in terms of the sequence-dependent rate constants k_(i) appearing

${{in}\;\left\lbrack {k_{m},k_{- m},k_{f}^{1},k_{r}^{1},k_{f}^{2},k_{r}^{2},k_{e},k_{- e},\frac{k_{cat}}{K_{N}},k_{cat}^{\prime}} \right\rbrack}\mspace{14mu} {as}\mspace{14mu} {\sum\limits_{i}\; {{k_{i}\left( {T(t)} \right)}{g_{i}\left( {x(t)} \right)}}}$

In such embodiments, the optimal control problem is expressed as the DAE of:

${H\;\left\lbrack {{x(t)},{T(t)},{\varphi (t)}} \right\rbrack} = {{{L\;\left\lbrack {{x(t)},{T(t)}} \right\rbrack} + {{\varphi^{Tr}(t)}{\sum\limits_{i = 1}^{10}\; {{k_{i}\left( {T(t)} \right)}{g_{i}\left( {x(t)} \right)}}}}} = {1 + {{\varphi^{Tr}(t)}{\sum\limits_{i = 1}^{10}\; {{k_{i}\left( {T(t)} \right)}{g_{i}\left( {x(t)} \right)}}}}}}$ $\mspace{79mu} {\frac{d\; \varphi}{dt} = {{- {\nabla_{x}H}} = {- {\sum\limits_{i = 1}^{10}\; {{k_{i}\left( {T(t)} \right)}\left( \frac{\partial{g_{i}\left( {x(t)} \right)}}{\partial{x(t)}} \right)^{Tr}{\varphi (t)}}}}}}$ ${{\frac{\partial H}{\partial{T(t)}}\left\lbrack {{x^{*}(t)},{\varphi^{*}(t)},{T^{*}(t)}} \right\rbrack} = {{{{\varphi^{*{,{Tr}}}(t)}{\sum\limits_{i = 1}^{10}\; \frac{\partial{k_{i}\left( {T(t)} \right)}}{\partial{T(t)}}}}_{T^{*}{(t)}}{g_{i}\left( {x^{*}(t)} \right)}} = 0}},{\forall{t \in \left\lbrack {0,t_{f}} \right\rbrack}}$      φ_(i)(t_(f)) = 0, i = 1, …  , 4n + 4 $\mspace{79mu} {{{x_{{4n} + 5}\left( t_{f} \right)} = x_{{{4n} + 5},f}},{{{x_{{{4n} + 5},f} > {2^{m}{{x_{{4n} + 5}(0)}.\mspace{79mu} H}}}_{t_{f}}} = {{{0\; \; {\varphi^{Tr}\left( t_{f} \right)}}{\sum\limits_{i = 1}^{10}\; {{k_{i}\left( {T\left( t_{f} \right)} \right)}{g_{i}\left( {x\left( t_{f} \right)} \right)}}}} = {- 1}}}}$

where x_(4n+5,f)=[DNA]_(f) and m denotes the number of cycles for which the time optimal solution is required. Thus, if m=1, x_(4n+5,f)≥2x_(4n+5)(0), which implies that more than one cycle will be generated; the single cycle time optimal temperature profile is obtained by considering T(t) up to the switching time between cycles. Equations (35) and (39) can be solved using any differential-algebraic equation solver.

Various embodiments of the invention have been described in fulfillment of the various objects of the invention. It should be recognized that these embodiments are merely illustrative of the principles of the present invention. Numerous modifications and adaptations thereof will be readily apparent to those skilled in the art without departing from the spirit and scope of the invention. 

1. A system for determining extension temperature and time in a temperature cycling protocol for a PCR reaction to achieve amplification of a DNA concentration in excess of a predetermined threshold, the system comprising: a memory device; and a processor coupled to the memory device, the processor configured to: a) receive information comprising concentrations of a hybridized primer-template DNA complex, SP, a type of DNA polymerase enzyme, E, a concentration of the DNA polymerase enzyme, and a concentration of nucleotide in the PCR reaction; b) determine a rate constant, k_(e)(T), of the DNA polymerase enzyme binding to the hybridized primer-template DNA as a function of temperature T; c) determine a rate constant, k_(−e)(T), of the DNA polymerase enzyme dissociation from the hybridized primer-template DNA as a function of temperature T; d) determine a Michaelis constant, K_(N)(T), for nucleotide binding to an enzyme-primer-template DNA complex, E.SP, as a function of temperature T; e) determine a catalytic rate constant, k_(cat)(T), of nucleotide addition to the enzyme-primer-template DNA complex, E.SP, as a function of temperature T; and f) determine (I) a temperature for DNA extension sufficient to produce a concentration of DNA in excess of the predetermined threshold within a predetermined extension time period or (II) a time period for DNA extension sufficient to produce a concentration of DNA in excess of the predetermined threshold at a predetermined extension temperature.
 2. The system of claim 1 wherein k_(e) and k_(−e) are determined according to a Gibbs free energy for the DNA polymerase binding reaction and a processivity of the DNA polymerase through the relationship: $k_{- e} = {{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left( \frac{1}{{E\left\lbrack i_{off} \right\rbrack} - 1} \right)}$ wherein [N] is the concentration of nucleotide, wherein i indexes the sequence position on the template, and wherein ${E\left\lbrack i_{off} \right\rbrack} = \frac{1}{1 - p}$ is the processivity or expected position of dissociation of the DNA polymerase, with p denoting a microscopic processivity parameter, the microscopic processivity parameter being the conditional probability of the polymerase not dissociating at position/time i, given that it was bound to the template at position i−1.
 3. The system of claim 2, wherein K_(N) and k_(cat) are experimentally or theoretically determined from a plot of 1/v versus 1/[N] wherein the slope of the plot is $\frac{K_{N}}{k_{cat}E_{0}}$ and the y-intercept is $\frac{1}{k_{cat}E_{0}},$ wherein E₀ is an initial concentration of DNA polymerase and v is an experimentally or theoretically determined rate of reaction for nucleotide addition.
 4. The system of claim 3, wherein determining the temperature or time period for DNA extension further comprises determining the enzyme-primer-template DNA complex concentration, [E.SP], according to: ${\frac{d}{dt}\left\lbrack {E.{SP}} \right\rbrack} = {{{k_{e}\lbrack E\rbrack}\lbrack{SP}\rbrack} - {k_{- e}\left\lbrack {E.{SP}} \right\rbrack} - {{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left\lbrack {E.{SP}} \right\rbrack}}$ wherein [N] is a concentration of the nucleotide.
 5. The system of claim 4, wherein determining the temperature or time period for DNA extension further comprises determining a concentration of enzyme-primer-template DNA with i added nucleotides, [E.D_(i)], according to: ${\frac{d}{dt}\left\lbrack {E.D_{i}} \right\rbrack} = {{{k_{e}\lbrack E\rbrack}\left\lbrack D_{i} \right\rbrack} - {k_{- e}\left\lbrack {E.D_{i}} \right\rbrack} + {{\frac{k_{cat}}{K_{N}}\lbrack N\rbrack}\left\{ {\left\lbrack {E.D_{i - 1}} \right\rbrack - \left\lbrack {E.D_{i}} \right\rbrack} \right\}}}$ wherein [N] is the concentration of nucleotide and [D_(i)] is a concentration of primer-template DNA with i added nucleotides.
 6. The system of claim 5 further comprising at least one input interface for entry of the information for the determination of the extension temperature and time
 7. The system of claim 5 further comprising at least one output interface for communicating the selected extension temperature and time to a user of the system or to an apparatus for conducting the extension reaction. 